Learning without Default: A Study of One-Class Classification and the Low-Default Portfolio Problem

نویسندگان

  • Kenneth Kennedy
  • Brian Mac Namee
  • Sarah Jane Delany
چکیده

This paper asks at what level of class imbalance one-class classifiers outperform two-class classifiers in credit scoring problems in which class imbalance, referred to as the low-default portfolio problem, is a serious issue. The question is answered by comparing the performance of a variety of one-class and two-class classifiers on a selection of credit scoring datasets as the class imbalance is manipulated. We also include random oversampling as this is one of the most common approaches to addressing class imbalance. This study analyses the suitability and performance of recognised two-class classifiers and one-class classifiers. Based on our study we conclude that the performance of the twoclass classifiers deteriorates proportionally to the level of class imbalance. The two-class classifiers outperform one-class classifiers with class imbalance levels down as far as 15% (i.e. the imbalance ratio of minority class to majority class is 15:85). The one-class classifiers, whose performance remains unvaried throughout, are preferred when the minority class constitutes approximately 2% or less of the data. Between an imbalance of 2% to 15% the results are not as conclusive. These results show that one-class classifiers could potentially be used as a solution to the low-default portfolio problem experienced in the credit scoring domain.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using semi-supervised classifiers for credit scoring

In credit scoring, low-default portfolios are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord), and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitabi...

متن کامل

Investigating The Asymmetric Effects of Macroeconomic Variables on Bank Default Rates During High and Low Default Periods

In recent decades, the high rate of inflation has been one of the concerns of Iran's economy, and one of the main causes of inflation has been the imbalance of banks. The level of non-current claims of banks has been increasing due to the economic recession, credit facilities and the lack of optimal allocation of facilities, and therefore it has unbalanced the balance sheets of banks, hence the...

متن کامل

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data, and these internal data sets are usually completed using external credit bureau’s data and finally used for estimating PD in banks. There is also a continuous interest for bank to use rule based classifiers to b...

متن کامل

The Current Models of Credit Portfolio Management: A Comparative Theoretical Analysis

The present paper aimed at studying the current models of credit portfolio management. There are currently three types of models which consider the risk of credit portfolio: the structural models (Moody's KMV model, and Credit- Metrics model), the intensity models (the actuarial models) and the econometric models (the Macro-factors model). The development of these three types of models is based...

متن کامل

Dependence of Default Probability and Recovery Rate in Structural Credit Risk Models: Empirical Evidence from Greece

The main idea of this paper is to study the dependence between the probability of default and the recovery rate on credit portfolio and to seek empirically this relationship. We examine the dependence between PD and RR by theoretical approach. For the empirically methodology, we use the bootstrapped quantile regression and the simultaneous quantile regression. These methods allow to determinate...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009